Foreword


This  report  summarizes  the  significant  literature  in  the  area  of 
image  texture  segmentation,  classification,  and  synthesis.  The  intent 
is  to  provide  guidance  and  direction  to  the  approaches  available 
for  image  texture  processing  and  a  measure  of  their  relative  merit. 
The  goal  of  this  effort  is  to  utilize  texture  processing  techniques 
for  the  classification  of  acoustic  provinces  in  sidescan  sonar  imagery. 


W.  B.  Moseley  L.  R.  Elliott,  Commander,  USN 

Technical  Director  Commanding  Officer 


Executive Summary

This report reviews the literature in the areas of image texture segmentation,
classification, and synthesis methods. The approaches to these
areas are grouped into fractal, spline, neural network, modeling,
and stochastic methods. An immense amount of literature in the
area  was  reviewed,  and  techniques  with  the  most  merit  are  presented. 
From  the  review  it  appears  that  no  single  approach  provides  a  robust 
texture  analysis  methodology  without  requiring  overwhelming  complexity. 
1.0 Introduction


This  report  surveys  the  documents  reviewed  in  the  areas  of  texture 
segmentation,  classification,  and  synthesis.  The  motivation  behind  this 
research  is  to  obtain  an  inventory  of  the  techniques  available,  and  their 
relative  merit.  The  purpose  of  this  research  is  to  assemble  a  core  of 
techniques  that  can  be  applied  to  generic  texture  processing  problems. 
The  goal  of  this  effort  is  to  utilize  texture  processing  techniques  for  the 
classification  of  acoustic  provinces  in  sidescan  sonar  imagery.  This 
section  defines  texture  and  discusses  some  of  the  perceptions  obtained 
from  the  literature  review. 

Modestino  et  al.1  describe  tone  as  the  average  gray  level  of  a  region 
and  texture  as  the  spatial  distribution  of  gray  levels  in  a  region.  They 
further indicate that a subtle relationship exists between tone and texture
that is highly dependent upon resolution; tone dominates at both
high  and  low  resolution  of  a  scene.  They  define  texture  as  “a  basic 
local  order  or  quasi-homogeneous  pattern  that  is  repeated  in  a  nearly 
periodic  manner  over  some  region  large  in  comparison  to  the  local 
pattern  size.”  They  indicate,  and  it  seems  to  be  widely  accepted,  that 
there  are  two  fundamental  approaches  to  texture  discrimination:  structural 
and  statistical. 

In  general,  a  given  texture  problem  requires  a  combination  of  these 
techniques.  Consider  the  following  examples  of  textures.  A  sand  texture 
can  be  handled  with  strictly  statistical  methods  and  a  group  of  parallel 
lines  with  strictly  structural  methods.  Brick  has  a  kernel  that  can  best 
be  described  by  statistical  methods,  but  the  placement  of  this  kernel  is 
best  handled  by  structural  methods.  Straw  has  a  structurally  defined 
kernel  with  a  statistical  placement  rule. 

In reviewing the literature about image texture, a perception develops
that there is a pursuit of a single algorithm that will perform the
task of texture analysis globally. In the two to three decades of this research,
hundreds of researchers have brought their own special tools into the
field from their areas of expertise. As a consequence, this area is difficult
to enter: there is an abundance of diverse techniques, each with its own
special language and mathematics. Furthermore, each approach in turn
appears  to  suffer  a  similar  fate.  In  the  beginning  each  new  technique  is 
typically  proclaimed  as  a  definitive  solution,  but  shortcomings  are  quickly 
discovered.  An  abundance  of  research  then  ensues  to  overcome  these 
shortcomings.  However,  in  the  process  of  this  fortification,  the  technique 
often  becomes  too  complicated  and  unwieldy  to  use.  The  most  recent 
example  of  this  process  is  demonstrated  by  the  relatively  new  area  of 
fractals.  This  approach,  attributed  to  Mandelbrot,2  is  currently  in  the 
process  of  fortification,  as  evidenced  by  the  efforts  of  Ait-Kheddache 
and Rajala.3 In this 1988 paper they indicate that visually distinct textures
may be indiscernible by fractal dimension (e.g., bark and pigskin) and
propose the use of “higher order” fractals for the segmentation
and classification of texture, based on work done by Hentschel and
Procaccia.4

It  would  seem  that  the  efforts  in  the  area  of  texture  could  be  best 
described  as  a  group  of  techniques  Nk,  which  are  completely  effective 
on  a  corresponding  group  of  images  Ik,  where  each  Ik  is  a  subset  of  the 
whole  image  space  G.  Typically,  each  Nk  technique  is  straightforward 
and  computationally  efficient,  and  is  effective  on  approximately  80%  to 
90%  of  the  images  to  which  it  is  applied.  In  an  attempt  to  modify  Nk 
so that it can also contend with images in the set G - Ik, the technique
quickly  grows  to  be  computationally  inefficient,  complicated,  and 
unrecognizable.  An  explanation  for  this  recurring  phenomenon  may  lie 
in the fact that the human vision/recognition system is not well understood,
and the efforts in the way of texture processing are an attempt
to  model  and  mimic  this  system. 

The  first  stages  of  this  literature  review  have  established  that  the  best 
method  of  texture  analysis  is,  in  fact,  several  methods.  Almost  any 
particular  texture  analysis  technique  can  likely  be  made  to  handle  most 
images, but the result is an overly complicated, time-consuming method.
It  would  therefore  seem  more  practical  to  use  a  variety  of  methods  in 
their  simplest  and  most  computationally  efficient  form  as  a  front  end  to 
the  human  vision  system  or  perhaps  to  an  artificial  intelligence  (AI) 
system.  Pursuing  avenues  that  lead  to  complex  and  inefficient  solutions 
is  needless,  since  many  already  exist  and  the  computer  hardware  necessary 
to  handle  these  solutions  does  not.  Review  of  the  literature  has  proven 
fruitful  in  this  regard:  it  reveals  blind  avenues  that  others  have  followed 
and provides a menu of techniques with various attributes. Each technique
should be judged primarily on its efficiency and simplicity, and
different  techniques  should  be  compared  in  regard  to  the  particular 
class  of  imagery  on  which  they  are  effective. 

A  new  technique  for  texture  processing  has  recently  emerged.  Neural 
networks  have  shown  a  great  propensity  for  easily  coping  with  nonlinear 
and chaotic phenomena. Neural networks are conceptually and computationally
straightforward, but presently require excessive processing
time  when  simulated  on  a  conventional  digital  computer.  There  is  a 
great  amount  of  interest  in  the  field,  and  several  vendors  report  that 
specialized  hardware  will  soon  be  available.  Widrow  and  Winter5  have 
already  used  neural  networks  to  create  a  pattern  recognition  classifier 
that  is  insensitive  to  translation,  rotation,  and  changes  in  scale;  these 
operations  have  posed  significant  problems  in  the  area  of  image  processing, 
yet  are  easily  coped  with  by  the  human  visual  system.  Neural  networks 
may  prove  to  be  highly  effective  for  making  decisions  based  on 
information  from  several  different  texture  extraction  preprocessing 
algorithms,  or  it  may  also  be  possible  to  directly  apply  a  neural  network 
to  model  the  phenomena  that  generate  an  image. 

The  following  sections  review  the  more  promising  techniques  and 
indicate their relative merit. Section 2.0 reviews papers that have done
comparisons  between  various  algorithms.  Section  3.0  reviews  fractal 
approaches,  and  Section  4.0  discusses  a  paper  on  the  generation  of 
fractals  using  spline  techniques.  Section  5.0  discusses  the  newer  neural 
network  approaches  to  texture  processing,  and  Section  6.0  briefly  reviews 
some  of  the  more  traditional  modeling  and  stochastic  techniques.  Finally, 
Section 7.0 discusses the principal conclusions drawn from this literature
review.

Surprisingly few papers compare various techniques or attempt to
survey the area of texture processing. Furthermore, most textbooks provide
only a cursory coverage of the wide variety of techniques that
have  been  attempted. 

Perhaps  this  deficiency  is  best  explained,  as  it  has  been  so  aptly  put 
by  many,  by  the  fact  that  so  many  of  these  techniques  are  ad  hoc  by 
nature.  Only  two  survey  papers  were  found:  Haralick,6  and  Conners  and 
Harlow.7  Haralick  is  referenced  many  times,  and  his  1979  paper  seems 
to  be  well  accepted  as  a  baseline  summary  of  the  available  techniques 
up  to  that  time.  Haralick  made  a  significant  point  in  referring  to  the 
current  techniques,  in  that  they  typically  emphasized  either  the  tonal 
primitive  properties  or  their  spatial  interrelations,  but  not  both.  In  this 
paper  he  indicated  eight  popular  statistical  approaches  to  texture  analysis: 

autocorrelation function — Given a bounded region, 0 < u < Lx and
0 < v < Ly, where (x, y) are the x, y translations and I(u, v) is the total
“energy” of the image at position (u, v), the autocorrelation is
given by:

$$
\rho(x,y) = \frac{\dfrac{1}{(L_x-|x|)(L_y-|y|)} \displaystyle\iint I(u,v)\, I(u+x,\,v+y)\, du\, dv}{\dfrac{1}{L_x L_y} \displaystyle\iint I^{2}(u,v)\, du\, dv} \tag{1}
$$


Haralick  describes  tonal  primitives  as  regions  with  uniform  tonal 
properties.  If  the  tonal  primitives  are  large,  the  autocorrelation  drops 
off  slowly  with  distance;  if  small  it  drops  off  rapidly.  If  the  tonal 
primitives  are  periodic,  the  autocorrelation  will  also  display  a  periodic 
behavior. 

optical  transforms  —  the  light  amplitude  distribution  at  the  front  and 
rear  focal  planes  of  a  lens  are  Fourier  transforms  of  one  another. 

digital  transforms  —  images  are  typically  divided  into  smaller  areas, 
and  digital  transforms  are  applied.  The  areas  are  then  compared  based 
on the transform characteristics. Popular transforms include the discrete
Fourier transform (DFT), sine and cosine transforms, Hadamard transform,
Slant transform, and the Haar transform.

textural  edginess  —  fine  textures  have  many  edges  per  unit  area. 

structural  elements  —  for  binary  images,  this  technique  emphasizes 
the  shape  aspects  of  tonal  primitives. 

spatial  gray-tone  co-occurrence  probabilities  —  a  coarse  texture  has 
a  slight  distribution  change  with  distance,  and  a  fine  texture  has  a 
larger  change.  This  method  does  not  capture  shape  aspects  of  the  tonal 
primitives  and  does  not  work  well  for  textures  with  large  area  primitives. 
A previous paper by Haralick et al.8 showed how to obtain 14 different
features from the gray-tone co-occurrence matrix, including entropy,
maximum probability, contrast, correlation, inverse difference moment,
and probability of run length. The co-occurrence matrix P, mapping G × G
to [0, 1], for an image I and binary relation R is given by:

$$
P(i,j) = \frac{\#\{((a,b),(c,d)) \in R \mid I(a,b) = i \text{ and } I(c,d) = j\}}{\#R} \tag{2}
$$


gray-tone  run  lengths  —  primitives  for  this  method  are  maximal  collinear 
connected  sets  of  the  same  gray  tone.  A  coarse  texture  will  have  many 
pixels  in  a  gray-tone  run,  and  a  fine  texture  will  have  fewer. 

autoregressive models — these models utilize linear estimates of previous
gray tones to generate the next gray tone. These models may be causal,
semicausal, or anticausal, and often inject a noise variable. For coarse
textures the coefficients will be similar; for fine textures the coefficients
in the linear estimate will have a wide variation. This approach is simple
and easy, but does not work well with macro textures.
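
As an illustration only (this is not code from Haralick's paper), the following Python sketch gives discrete analogues of three of the approaches listed above: the autocorrelation of Eq. (1), the co-occurrence matrix of Eq. (2), and gray-tone run lengths. The function names, the use of NumPy, the restriction of the relation R to a single pixel offset (dx, dy), the contrast formula as the sum of (i - j)^2 P(i, j), and the choice of horizontal runs are assumptions made for the example.

```python
from itertools import groupby

import numpy as np


def autocorrelation(image, x, y):
    """Discrete analogue of Eq. (1) for one translation (x, y)."""
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape                      # Ly, Lx
    if abs(x) >= cols or abs(y) >= rows:
        raise ValueError("translation must be smaller than the image")
    # Overlap between the image and its copy translated by (x, y).
    a = img[max(0, -y):rows - max(0, y), max(0, -x):cols - max(0, x)]
    b = img[max(0, y):rows - max(0, -y), max(0, x):cols - max(0, -x)]
    numerator = (a * b).mean()                  # mean over (Lx-|x|)(Ly-|y|) samples
    denominator = (img ** 2).mean()             # mean over Lx*Ly samples
    return numerator / denominator


def cooccurrence(image, dx, dy, levels):
    """Eq. (2), assuming R = all pixel pairs separated by (dx, dy)."""
    img = np.asarray(image, dtype=int)
    rows, cols = img.shape
    counts = np.zeros((levels, levels))
    for r in range(max(0, -dy), rows - max(0, dy)):
        for c in range(max(0, -dx), cols - max(0, dx)):
            counts[img[r, c], img[r + dy, c + dx]] += 1
    total = counts.sum()                        # number of pairs in R
    return counts / total if total > 0 else counts


def contrast(p):
    """Contrast feature (assumed form): sum of (i - j)^2 * P(i, j)."""
    i, j = np.indices(p.shape)
    return float(((i - j) ** 2 * p).sum())


def run_lengths(row):
    """Maximal runs of equal gray tone along one row, as (tone, length)."""
    return [(tone, len(list(group))) for tone, group in groupby(row)]
```

For a periodic stripe pattern the autocorrelation returns to 1 at a translation equal to the period, which matches the periodic behavior described above; a coarse texture produces long runs from `run_lengths`, and a fine texture produces short ones.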

Haralick6  further  indicated  that  the  first  three  techniques  measure 
spatial  frequency,  but  that  they  are  not  invariant  under  monotonic 
transformations  of  gray  tone.  For  these  techniques,  fine  textures  will 
have  high  frequencies,  and  coarse  textures  will  have  predominantly  low 
frequencies. He further mentioned that Weszka et al.9 showed that the
effectiveness of these techniques was significantly poorer than that of other
approaches.

Haralick made little mention of structural approaches in this
paper.  One  such  technique  is  mosaics,  where  a  picture  is  tessellated 
into  regions  and  gray  levels  are  assigned  to  each  region  based  on  a 
specified  probability  density  function.  Also,  he  indicated  that  for  macro 
textures,  investigators  are  using  histograms  of  primitive  properties  and 
co-occurrence  of  primitive  properties  as  a  generalization  of  the  structural 
and  statistical  approaches. 

Conners  and  Harlow7  made  a  detailed  comparison  of  four  algorithms: 

•  spatial  gray-level  dependence  method  (SGLDM) 

•  gray-level  run  length  method  (GLRLM) 

•  gray-level  difference  method  (GLDM) 

•  power  spectrum  (PSM). 

Conners  and  Harlow  examined  these  algorithms  based  on  the  texture 
information  content  of  the  intermediate  matrices  of  the  processes,  and 
thus  determined  the  relative  sets  over  which  each  algorithm  is  effective. 
The  results  of  their  analysis  follow: 

•  GLRLM  could  not  discriminate  all  visually  distinct  texture  pairs. 

•  There  exists  a  visually  distinct  texture  pair  for  which  GLDM  cannot 
discriminate  for  any  value  of  spacing,  d. 

•  PSM  cannot  discern  all  visually  distinct  texture  pairs. 

•  SGLDM  can  discern  a  larger  class  of  textures  than  GLRLM,  even 
when  only  a  sample  spacing  of  1  is  used.  SGLDM  is  also  more  powerful 
than  GLDM  and  PSM. 

•  GLDM  is  more  powerful  than  PSM. 
•  None of these algorithms could discriminate between a Markov
texture  and  a  180  degree  rotation  of  the  texture. 

•  The  GLRLM  suffers  from  noise  sensitivity. 

•  SGLDM  and  GLDM  work  better  when  multiple  intersample  spacing 
distances  are  used. 

•  SGLDM  is  much  more  powerful  than  the  PSM. 

Weszka  et  al.9  analyzed  these  same  algorithms,  but  Conners  and  Harlow7 
indicate  that  the  comparison  methods  used  by  Weszka  et  al.  were  not 
general enough to completely judge the algorithms' performance. However,
both  papers  drew  the  same  conclusions  on  the  relative  power  of  these 
four  algorithms. 


A  fractal  is  an  object  with  the  property  that  it  is  self-similar  at  various 
scales  of  magnification.  Consequently,  such  an  object  has  a  fractional 
dimension  and  its  power  spectrum  is  a  function  of  1/frequency.  The 
classical  example10  is  the  measurement  of  a  shoreline,  where  the  length 
of  the  shoreline  increases  as  the  length  of  the  measuring  device  decreases. 
This effect can be stated as follows: given a yardstick of length L, the
measurement of an n-dimensional surface is given by M = nL^D, where
D is the topological dimension of the yardstick. Given a fractal surface, D
is the fractional power that yields a consistent measure M for all sizes
of the yardstick L.10 A more rigorous definition is: A random function
f(x) is a fractal Brownian function if for all x and Δx

$$
\Pr\left( \frac{f(x+\Delta x) - f(x)}{|\Delta x|^{H}} < y \right) = F(y), \tag{3}
$$

where the spectral density of a fractal Brownian function is proportional
to f^(-2H-1), with f here denoting frequency.
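
The yardstick relation can be turned into a simple estimator. If the measure M = nL^D is to stay constant as the yardstick L shrinks, the number of yardsticks n must grow as L^(-D), so D is the slope of log(n) against log(1/L). The sketch below is a minimal illustration under that reading; the function name and the least-squares fit are assumptions, not a method taken from the cited papers. The Koch curve is used as a check because each refinement divides the yardstick by 3 and multiplies the count by 4, giving D = log 4 / log 3.

```python
import math


def yardstick_dimension(lengths, counts):
    """Least-squares slope of log(count) against log(1/length)."""
    xs = [math.log(1.0 / length) for length in lengths]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Koch curve: yardstick 3^-k needs 4^k steps.
koch_lengths = [3.0 ** -k for k in range(1, 6)]
koch_counts = [4 ** k for k in range(1, 6)]
koch_dimension = yardstick_dimension(koch_lengths, koch_counts)
```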

Fractals  and  fractal  dimension  have  proven  to  be  extremely  valuable 
in texture analysis and synthesis.10,11,12,13 They have been the only effective
method  for  generating  realistic-looking  terrain,  clouds,  and  many  other 
objects  that  occur  in  nature.  The  use  of  fractal  dimension  has  proven  to 
be  highly  effective  in  the  segmentation  and  classification  of  many  textures. 
Perhaps  their  greatest  attribute  is  the  simplicity  and  numerical  efficiency 
of  their  algorithms.  Mandelbrot2  is  acknowledged  to  be  the  father  of 
fractal  mathematics,  although  his  work  may  be  hard  to  follow  and  utilize. 
Books  by  Barnsley  et  al.,11  Barnsley,12  and  Peitgen  and  Richter13  seem 
to be much more functional. The Science of Fractal Images11 is a
collection of papers and lecture notes from many of the major researchers
in the area, and includes several ready-to-implement algorithms. Fractals
Everywhere12 is an excellent treatment of the mathematics involved and
is easy to follow. The Beauty of Fractals13 takes more of a dynamical
systems  approach,  looking  deeper  into  the  chaotic  phenomena  that  generate 
fractal  shapes. 

In his paper, Brammer14 discusses many of the uses and characteristics
of fractals and fractal dimension:

•  It is generally true that the image of a fractal set is fractal with a
direct  relationship  between  the  dimensions. 

•  The  fractal  dimension  at  multiple  resolutions  can  be  used  to  determine 
object  ranges  in  imagery. 

•  The  fractal  sum  of  pulses  method  has  been  used  for  cloud  analysis 
and  forecasting. 

•  Fractal  dimension  is  used  for  detecting  manmade  objects  in  natural 
scenes  and  for  the  automated  detection  of  cracks  in  industrial  applications. 

•  Fractal  techniques  are  being  used  for  image  compression.  The  Peano 
scan  method  achieves  a  bit  rate  of  less  than  one  bit  per  pixel.  Barnsley 
et  al.11  are  working  on  an  iterated  function  method  of  image  compression 
that  would  give  supervised  compression  ratios  on  the  order  of  10,000:1, 
and  automated  ratios  of  125:1. 

Fournier et al.15 provide algorithms that are a modification of
Mandelbrot’s techniques, but are more computationally efficient. They
explain that the shear displacement process requires O(N^3) operations,
that the modified Markov process requires O(N log(N)) operations, and that
the inverse Fourier transform requires O(N log(N)) operations. Fournier
et al. provide algorithms in Pascal for fractal line and surface generation
that require only O(N) operations. They note that the generated
objects  are  not  stationary,  isotropic,  or  self-similar,  but  that  they  are 
realistic  looking. 

Pentland10  is  frequently  referenced  by  other  researchers,  and  in  this 
paper  he  derives  the  relationship  between  the  fractal  dimension  of  a 
surface  and  the  fractal  dimension  of  its  image.  He  shows  that  a  three- 
dimensional  (3-D)  surface  with  a  spatially  isotropic  fractal  Brownian 
shape  produces  an  image  whose  intensity  surface  is  fractal  Brownian  and 
whose  fractal  dimension  is  identical  to  that  of  the  components  of  the 
surface  normal,  given  a  Lambertian  surface  reflectance  and  constant 
illumination  and  albedo.  Thus,  if  the  surface  is  fractal,  so  will  be  the 
image.  Furthermore,  the  fractal  dimension  of  imaged  contours  is  the  same 
as  that  of  the  3-D  contour,  and  the  surface’s  dimension  is  1  plus  the 
contour’s  dimension.  To  determine  if  an  image  exhibits  a  fractal  nature, 
Pentland  suggests  that  histograms  over  multiple  scales  can  be  used. 
Using this method, if the standard deviations of the histograms are nearly
linear versus scale, then the image is fractal-like. An important characteristic
noted in this paper is that the fractal dimension of regions that
contain  a  boundary  is  typically  less  than  the  topological  dimension. 
Although  this  result  is  erroneous,10  it  can  be  used  successfully  for  edge 
detection  in  an  image. 

In  this  paper,  Pentland  also  reports  the  results  of  a  fractal  dimension 
texture  segmenter,  which  yielded  a  classification  accuracy  of  84.4%  in 
contrast  to  65%  for  correlation  statistics  and  72%  for  co-occurrence 
statistics. He used the power spectrum method of determining fractal
dimension given by log(P(f)) = -(2H + 1) log(f) + k, where P(f) is the
power spectrum, and the fractal dimension is given by 2 - H. In this
test he used 8 × 8 pixel blocks, and he indicated that the results typically
proved effective over scales of 4:1 and in some cases as much as
8:1. Pentland also indicated that one of the shortfalls of fractals is that
they do not describe regular or large-scale spatial structures. However,
a possible method for handling this situation is first to detect the edges
in the image and then to analyze the nonedge regions. He also points
out  that  fractals  are  strictly  an  abstraction  and  that  physical  objects  will 
behave  like  fractals  only  over  a  range  of  parameters;  researchers  seem 
to  have  often  overlooked  this  point  in  their  discussions  of  the  use  of 
fractal  dimension  for  image  analysis. 
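
The power spectrum relation log(P(f)) = -(2H + 1) log(f) + k suggests a straightforward estimator: fit a line to the log-log spectrum and solve the slope for H. The sketch below is a minimal illustration for a one-dimensional profile, not Pentland's implementation; the function names, the use of NumPy's FFT and `polyfit`, and the removal of the mean before transforming are assumptions. The dimension is taken as 2 - H, as stated above.

```python
import numpy as np


def fit_hurst(freqs, power):
    """Solve slope = -(2H + 1) for H from a log-log line fit."""
    slope, _intercept = np.polyfit(np.log(freqs), np.log(power), 1)
    return -(slope + 1.0) / 2.0


def fractal_dimension_from_hurst(h):
    """Dimension as quoted in the text: D = 2 - H."""
    return 2.0 - h


def profile_hurst(profile):
    """Estimate H from the periodogram of a 1-D gray-level profile."""
    x = np.asarray(profile, dtype=float)
    x = x - x.mean()                          # drop the zero-frequency term
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(x.size)
    keep = (freqs > 0) & (power > 0)          # logs need positive values
    return fit_hurst(freqs[keep], power[keep])
```

On real imagery the spectrum follows a power law only over a limited band of frequencies, so in practice the fit would be restricted to that band, in line with the caution that physical objects behave like fractals only over a range of scales.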

Keller et al.16 describe an unsupervised, low computational segmentation
routine that uses an improvement on the lacunarity feature introduced by
Mandelbrot. The term lacunarity is used to describe the characteristic of
fractals that have the same fractal dimension but exhibit different textures.
Keller et al. indicate that fractal dimension alone is generally insufficient
to classify natural textures, and that natural fractal surfaces typically
exhibit statistical self-similarity rather than deterministic self-similarity. In
this paper, the segmentation was performed by clustering, and the box
dimension was used to estimate the fractal dimension. The paper includes
the algorithms for calculating the box dimension and the technique for
K-means segmentation. Keller modified the box-counting method
for dimension estimation, thereby producing the “interpolation” method.
This  method  helped  to  overcome  the  quantization  effects  by  interpolating 
between  the  center  point  of  a  cube  and  each  of  its  neighbors.  The 
results  of  the  segmentation  were  excellent,  indicating  that  this  technique 
shows  great  promise. 

Margerum  and  Werkheiser17  provide  code  written  in  LISP  for  generating 
fractal landscapes. They included several generated images and demonstrated
the effects of parameter change in the generation algorithm. The
scene generation involved using a coarse elevation grid and the formula
E_new = E_avg + R d^(3-D), where E_new is the new elevation of the center
point of a square on the grid, E_avg is the average of the four corner
values of the square, R is a random number from a Gaussian distribution,
d is the distance from the center point of the square to a vertex, and D
is the fractal dimension. A fractal dimension near 2 provided the most
realistic-looking  results. 
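
The center-point update E_new = E_avg + R d^(3-D) can be sketched in a few lines. The Python below is an illustration only and is unrelated to the LISP code of Margerum and Werkheiser; the function names, the unit-variance Gaussian, and the treatment of the grid as a list of rows with uniform spacing are assumptions. Only the center points of each grid square are computed, since that is the step described above.

```python
import math
import random


def center_elevation(corners, d, D, rng=random):
    """E_new = E_avg + R * d**(3 - D) for one grid square."""
    e_avg = sum(corners) / 4.0
    r = rng.gauss(0.0, 1.0)                   # assumed zero mean, unit variance
    return e_avg + r * d ** (3.0 - D)


def refine_centers(grid, spacing, D, rng=random):
    """Center elevations for every square of a coarse elevation grid."""
    d = spacing * math.sqrt(2.0) / 2.0        # center-to-vertex distance
    centers = []
    for r in range(len(grid) - 1):
        row = []
        for c in range(len(grid[0]) - 1):
            corners = (grid[r][c], grid[r][c + 1],
                       grid[r + 1][c], grid[r + 1][c + 1])
            row.append(center_elevation(corners, d, D, rng))
        centers.append(row)
    return centers
```

Because d shrinks at each level of refinement, the random displacement d^(3-D) shrinks with it; a larger D makes the exponent smaller and therefore keeps more roughness at fine scales.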

Vernazza18 provided a much needed comparison of fractal-dimension
estimators. He compared the results of four approaches to measuring
the fractal dimension of an image: the spatial Pentland approach, the
frequential Pentland approach (using the power spectrum), the blanket
approach, and the Euclidean approach. For the test he used an image
with fractal dimension of 1.2 and an 8 × 8 mask size. Once the fractal
dimension of each block was computed, the image was segmented using
a histogram of the fractal dimension. The best results were given by the
spatial Pentland method and the blanket approach, with the blanket
approach providing the best estimate of fractal dimension. The frequential
Pentland approach and the Euclidean approach produced very jagged
curves, requiring linear estimation to determine the fractal dimension.
The  procedure  for  the  blanket  approach  is  as  follows: 

An image is covered with an upper surface u(ε, i, j) and a lower
surface b(ε, i, j). Defining the gray-level image g(i, j) = u(0, i, j) = b(0, i, j)
at the beginning, with ε = 1, 2, 3, ..., the blanket surfaces are

$$
u(\varepsilon,i,j) = \max\Big\{ u(\varepsilon-1,i,j) + 1,\ \max_{|(m,n)-(i,j)| \le 1} u(\varepsilon-1,m,n) \Big\} \tag{4}
$$

and

$$
b(\varepsilon,i,j) = \min\Big\{ b(\varepsilon-1,i,j) - 1,\ \min_{|(m,n)-(i,j)| \le 1} b(\varepsilon-1,m,n) \Big\}, \tag{5}
$$

where the image points (m, n) are the four neighboring points of (i, j).
The blanket volume is given by

$$
V(\varepsilon) = \sum_{i,j} \big( u(\varepsilon,i,j) - b(\varepsilon,i,j) \big) \tag{6}
$$

and the surface area is

$$
A(\varepsilon) = \frac{V(\varepsilon) - V(\varepsilon-1)}{2}. \tag{7}
$$

Thus, the surface area A is computed for different ε values and, since
A(ε) = λ ε^(2-D), D can be estimated in the bilog (log-log) plane from the
slope of a linear regression.
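
The blanket procedure of Eqs. (4) through (7) maps directly onto array operations. The following Python sketch is a minimal illustration, not the code used in the cited papers; the function names, the use of NumPy, and the replication of edge pixels at the image border are assumptions. The dimension is obtained from the regression slope of log A(ε) against log ε, using A(ε) proportional to ε^(2-D).

```python
import numpy as np


def _four_neighbors(a):
    """Stack the four-neighbor shifts of `a` (border pixels replicated)."""
    p = np.pad(a, 1, mode="edge")
    return np.stack([p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:]])


def blanket_areas(gray, max_eps):
    """Return A(1), ..., A(max_eps) from Eqs. (4)-(7)."""
    g = np.asarray(gray, dtype=float)
    u, b = g.copy(), g.copy()                 # u(0) = b(0) = g
    volumes = [0.0]                           # V(0) = 0
    for _ in range(max_eps):
        u = np.maximum(u + 1.0, _four_neighbors(u).max(axis=0))   # Eq. (4)
        b = np.minimum(b - 1.0, _four_neighbors(b).min(axis=0))   # Eq. (5)
        volumes.append(float((u - b).sum()))                      # Eq. (6)
    v = np.array(volumes)
    return (v[1:] - v[:-1]) / 2.0                                 # Eq. (7)


def blanket_dimension(gray, max_eps):
    """Estimate D from the slope of log A(eps) versus log eps."""
    areas = blanket_areas(gray, max_eps)
    eps = np.arange(1, max_eps + 1)
    slope, _intercept = np.polyfit(np.log(eps), np.log(areas), 1)
    return 2.0 - slope
```

A perfectly flat image gives a constant area at every scale and therefore D = 2, the topological dimension of a surface, while a rough image gives an area that falls as ε grows and therefore D greater than 2.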

Peleg et al.19 introduce the concept of fractal signature. They describe
the fractal signature as the change in measured image surface area with
a change in scale. Using the blanket method given above by Vernazza,
they define the fractal signature as S(ε) = 2 - D. For a truly fractal
object, S is invariant with changes in ε. For a nonfractal surface, the
magnitude of the fractal signature S(ε) indicates the amount of information
that is lost for a given yardstick ε. A high value of S(ε) for a small
ε indicates the presence of high frequencies in the image, and a high
value of S(ε) for a large ε indicates the presence of low frequencies in
the image.

Thus, S(ε) directly provides information about the fineness or coarseness
of a texture. Using the Peleg et al. method, textures are compared via
their fractal signatures:

$$
D(i,j) = \sum_{\varepsilon} \big( S_i(\varepsilon) - S_j(\varepsilon) \big)^{2} \log\!\left( \frac{\varepsilon + 1/2}{\varepsilon - 1/2} \right) \tag{8}
$$

where D(i, j) denotes the distance between textures i and j (not a fractal
dimension) and the log weighting is due to the unequal spacing between points
in the log-log scale. They indicate that this technique required only a
small number of texture descriptors and used ε = 2 to 49 in this paper.

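Apparently prompted by Mandelbrot’s comments that coastline
measurement yielded different results depending on whether the measurement
is made on the land side or the water side, Peleg et al. defined
a two-sided volume and area for the blanket technique. The new volume
and area measures are given as

$$
V^{+}(\varepsilon) = \sum_{i,j} \big( u(\varepsilon,i,j) - g(i,j) \big), \tag{9}
$$

$$
V^{-}(\varepsilon) = \sum_{i,j} \big( g(i,j) - b(\varepsilon,i,j) \big), \tag{10}
$$

$$
A^{+}(\varepsilon) = V^{+}(\varepsilon) - V^{+}(\varepsilon-1), \tag{11}
$$

$$
A^{-}(\varepsilon) = V^{-}(\varepsilon) - V^{-}(\varepsilon-1), \tag{12}
$$

with corresponding fractal signature measures S⁺ and S⁻. The new distance
measure for comparing textures is given by

$$
D(i,j) = \sum_{\varepsilon} \Big[ \big( S_i^{+}(\varepsilon) - S_j^{+}(\varepsilon) \big)^{2} + \big( S_i^{-}(\varepsilon) - S_j^{-}(\varepsilon) \big)^{2} \Big] \log\!\left( \frac{\varepsilon + 1/2}{\varepsilon - 1/2} \right). \tag{13}
$$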
Testing with these new measures revealed that graphs of S⁻ represented
the shapes of objects in an image, and S⁻ is the same for two different
images with the objects arranged in different positions. The measure S⁺,
which represents the background of the image, produced different results
for  different  object  arrangements.  Peleg  et  al.  also  mention  that  these 
algorithms  can  be  modified  so  that  the  blanket  growth  direction  is 
directional  and  thus  sensitive  to  directional  textures.  This  paper  clearly 
indicates  that  fractal  measures  can  provide  much  significant  information 
even  for  nonfractal  images. 

Arduini  et  al.20  extend  the  Peleg  et  al.  work  by  providing  an  adaptive 
method  of  choosing  the  optimal  mask  size  for  determining  the  fractal 
dimension  of  an  image  region.  Arduini  et  al.  note  that  a  large  mask  will 
tend  to  smooth  the  fractal  dimension  D  in  the  area,  but  a  small  mask 
will be very sensitive to noise. Furthermore, a large mask allows determination
of D to a higher accuracy, but a small mask provides for better
spatial resolution. They also comment that when using the blanket method,
the surface area A(ε) is given by Mandelbrot as A(ε) = V(ε)/(2ε), and is
given by Peleg et al. as A(ε) = (V(ε) - V(ε - 1))/2. They indicate that
the Peleg et al. method is less noise sensitive and that the Mandelbrot
method gives a more precise value. The Arduini et al. method involves
using a 20 × 20 mask over an image, and subdividing this mask into a
set of 10 smaller masks of various sizes and shapes. For each of these
smaller masks, they divide the mask into 5 × 5 blocks, compute the
fractal dimension for each block, and compute the variance of the fractal
dimension for these 5 × 5 blocks. Thus, the best mask to use for the
20 × 20 region will be the one with lowest variance in fractal dimension
among its 5 × 5 blocks.
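
The selection rule at the end of this procedure can be written compactly. The sketch below assumes that the per-block fractal dimensions have already been computed for each candidate mask; the function name, the dictionary layout, the mask labels, and the use of the population variance are all assumptions made for illustration, not details from Arduini et al.

```python
from statistics import pvariance


def best_mask(block_dimensions):
    """Pick the mask whose 5 x 5 blocks vary least in fractal dimension.

    block_dimensions maps a (hypothetical) mask label to the list of
    fractal-dimension estimates for the blocks inside that mask.
    """
    return min(block_dimensions,
               key=lambda name: pvariance(block_dimensions[name]))


# Hypothetical estimates for two candidate masks in one 20 x 20 region.
candidates = {
    "square_10x10": [2.31, 2.29, 2.30, 2.30],
    "strip_5x20": [2.30, 2.45, 2.62, 2.70],
}
chosen = best_mask(candidates)
```

A low variance indicates that the mask covers a single texture, which is why this rule tends to preserve region edges.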

For  a  test  of  this  method,  a  fractal  image  of  dimension  2.7  was 
used  with  a  patch  having  a  fractal  dimension  of  2.3.  The  patch  was  not 
visually  discernible,  yet  the  segmentation  algorithm  easily  detected  the 
patch. In comparison to other fractal estimation techniques using
a 5 × 5 mask, a 10 × 10 mask, and a region growing technique, the
Arduini et al. method provided the best results, particularly in its
preservation of the edges of the patch. They indicated that the
FORTRAN-implemented algorithm took 6 hours for a 256 × 256 image
on a Hewlett-Packard HP 1000 computer.

Ait-Kheddache  and  Rajala21  extend  fractal  techniques  to  encompass 
a  larger  class  of  images,  i.e.,  to  include  images  that  contain  different 
textures  with  the  same  fractal  dimension.  They  use  the  generalized 
dimensions  of  fractals  derived  by  Hentschel  and  Procaccia,4  where  the 
generalized dimension $D_f$ is given by

\[ D_f = \frac{1}{f - 1} \lim_{l \to 0} \frac{\log \left\{ \sum_{i=0} p_i \exp \left[ (f - 1) \log p_i \right] \right\}}{\log (l)} \qquad (14) \]


for any $f > 0$. Hentschel and Procaccia have proven that

\[ \lim_{f \to 0} D_f = D_0 = \text{fractal dimension} , \qquad (15) \]

\[ \lim_{f \to 1} D_f = D_1 = \text{information dimension} , \qquad (16) \]

\[ \lim_{f \to 2} D_f = D_2 = \text{correlation dimension} . \qquad (17) \]

Ait-Kheddache  and  Rajala  present  images  of  bark  and  pigskin  as  an 
example  of  textures  that  cannot  be  discerned  by  fractal  dimension  alone. 
In their tests they used the lower three generalized dimensions ($D_0$, $D_1$,
and $D_2$) to classify textures. Their results were 100% classification for
water/grass,  water/sand,  and  bark/pigskin,  and  81%  for  grass/sand.  The 
methods  for  computing  the  information  and  the  correlation  dimension 
are  a  trifle  complicated,  but  are  explained  in  detail  by  Hentschel  and 
Procaccia.4 
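
As a rough illustration of how the generalized dimensions might be estimated numerically, the sketch below applies a box-counting fit to a square array treated as a measure. It is not the procedure of Ait-Kheddache and Rajala or of Hentschel and Procaccia; the function names, the choice of box sizes, and the least-squares slope fit are all assumptions made for the example.

```python
import math


def box_probabilities(measure, size):
    """Sum a square 2-D measure over size x size boxes and normalise to probabilities."""
    n = len(measure)
    boxes = {}
    for i in range(n):
        for j in range(n):
            key = (i // size, j // size)
            boxes[key] = boxes.get(key, 0.0) + measure[i][j]
    total = sum(boxes.values())
    return [v / total for v in boxes.values() if v > 0.0]


def generalized_dimension(measure, f, sizes=(1, 2, 4, 8)):
    """Estimate D_f as the slope of log(sum p_i^f) / (f - 1) against log(l)."""
    n = len(measure)
    xs, ys = [], []
    for size in sizes:
        p = box_probabilities(measure, size)
        if abs(f - 1.0) < 1e-12:
            # Limit f -> 1: the information dimension uses sum p_i log p_i.
            y = sum(pi * math.log(pi) for pi in p)
        else:
            y = math.log(sum(pi ** f for pi in p)) / (f - 1.0)
        xs.append(math.log(size / n))   # l is the box size relative to the image
        ys.append(y)
    # Least-squares slope of y against x.
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
```

A uniform measure on a plane should give a value of 2 for every f, so textures that differ only in how unevenly their measure is spread would be separated by $D_1$ and $D_2$ rather than by $D_0$.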


4.0 Spline Techniques

While splines are typically a tool for graphics and for data interpolation,
the paper by Szeliski and Terzopoulos22 is pertinent to texture
processing in that it provides a novel approach to generating fractal
curves. Two good references for spline fundamentals are Ahlberg et al.23
and Bartels et al.24 Typically, a spline is a piecewise cubic function that
is fit to a series of points to form a smooth curve. These functions are
typically continuous in the first and second derivatives, but have a jump
in the third derivative. Techniques exist for both interpolation (exact fit
to a data set) and for approximation, where the maximum distance is
specified between the points and the spline. The fundamental theorem of
splines, due to Holladay (1957), is given by Ahlberg et al.23

Given a set of $x_i$ on the interval $[a, b]$ and a corresponding set $y_i$,
then of all functions $f(x)$ having a continuous second derivative on
$[a, b]$ such that $f(x_i) = y_i$, the spline function $S(f; x)$ with junction points
at the $x_i$ and with its second derivative equal to 0 at $x = a, b$ minimizes
the integral

\[ \int_a^b \left| f''(x) \right|^2 dx . \qquad (18) \]
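
The spline named in the theorem (cubic pieces joined at the $x_i$, with zero second derivative at both ends) can be built with the standard tridiagonal recurrence. The sketch below is a textbook construction offered for illustration, not code from Ahlberg et al.; the function name `natural_cubic_spline` is hypothetical.

```python
def natural_cubic_spline(xs, ys):
    """Return a function S(x) interpolating (xs, ys) with S'' = 0 at both ends.

    xs must be strictly increasing and contain at least two points.
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Right-hand side of the tridiagonal system for the second-derivative terms.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = (3.0 * (ys[i + 1] - ys[i]) / h[i]
                    - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1])

    # Forward elimination.
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]

    # Back substitution; c[0] = c[n] = 0 enforces the natural end conditions.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def spline(x):
        # Pick the segment containing x (the last segment also covers x >= xs[n]).
        j = n - 1
        for k in range(n):
            if x < xs[k + 1]:
                j = k
                break
        t = x - xs[j]
        return ys[j] + b[j] * t + c[j] * t * t + d[j] * t ** 3

    return spline
```

When the data lie on a straight line the curvature terms vanish and the spline reproduces that line, which matches the minimizing property: the integral in equation 18 is then zero.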


In  their  paper,  Szeliski  and  Terzopoulos  mention  that  splines  are 
easily  constrained  and  well  suited  for  modeling  smooth  objects,  and 
that  fractals  are  more  suitable  for  generating  irregular  shapes  but  are 
difficult  to  constrain.  They  indicate  that  the  generation  of  fractals  using 
Fourier  methods  results  in  fractals  that  cannot  be  controlled  locally, 
and  that  the  standard  perturbation  method  results  in  a  nonstationary 
process. To overcome these weaknesses, Szeliski and Terzopoulos
employed a general class of multivariate spline models called controlled
continuity splines, which afford local control over smoothness. Using
this spline model, they injected white noise, and the smoothing effect
of the spline spread this noise spatially to produce a fractal-like curve
while retaining local control of the curve. They applied this technique to
synthesize realistic terrain from sparse elevation data with very impressive
results.


5.0  Neural  Networks 


Neural network technology is the newest player in the texture analysis
arena, and it appears to have a great propensity for modeling nonlinear
and chaotic phenomena. A neural network is formed by connecting one
or more layers of neurons, where each neuron performs:25

\[ X_i^{\mathrm{out}} = g \left( \sum_j T_{ij} X_j^{\mathrm{inp}} + \theta_i \right) \qquad (19) \]

where $X_j^{\mathrm{inp}}$ are the inputs to the neuron, $X_i^{\mathrm{out}}$ is the neuron’s output,
$T_{ij}$ are the neuron’s weights, $\theta_i$ is a constant, and $g(x)$ is a nonlinear
function, typically a sigmoidal form. The network is “trained” to behave
like a particular system by feeding it an input/output sequence of the
system to be mimicked. If $t_i^{(p)}$ are the training output values for the $p$th
input pattern, and $O_i^{(p)}$ are the actual output values, the network is trained
by minimizing

\[ E = \sum_p \sum_i \left( t_i^{(p)} - O_i^{(p)} \right)^2 , \qquad (20) \]

where $i$ indexes the neurons in the output layer. An iterative
procedure for training, known as the backpropagation method, is given
by Lapedes and Farber25 as follows:

\[ \Delta T_{ij} = \epsilon \sum_p \delta_i^{(p)} O_j^{(p)} \qquad (21) \]

and

\[ \Delta \theta_i = \epsilon \sum_p \delta_i^{(p)} , \qquad (22) \]

where, for a neuron in the output layer,

\[ \delta_i^{(p)} = \left( t_i^{(p)} - O_i^{(p)} \right) O_i^{(p)} \left( 1 - O_i^{(p)} \right) \qquad (23) \]

and, for a neuron in a hidden layer,

\[ \delta_i^{(p)} = O_i^{(p)} \left( 1 - O_i^{(p)} \right) \sum_k \delta_k^{(p)} T_{ki} . \qquad (24) \]

This procedure involves first computing $\delta_i$ for the output layer and then
using the previous equation to compute $\delta_i$ for the hidden layers.
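
The update rules in equations 21 through 24 can be exercised on a toy problem. The following is a minimal sketch, assuming a logistic sigmoid for $g$, batch updates over all patterns, and small random initial weights; the function names and the network layout (a list of weight-matrix and offset pairs) are choices made for the example and are not taken from Lapedes and Farber.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid; its derivative g(1 - g) appears in (23) and (24)."""
    return 1.0 / (1.0 + math.exp(-x))


def make_network(sizes, seed=0):
    """Build [(T, theta), ...] with small random weights; sizes = [n_in, ..., n_out]."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [rng.uniform(-0.5, 0.5) for _ in range(n_out)])
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(x, layers):
    """Return the outputs O of every layer; acts[0] is the input pattern."""
    acts = [list(x)]
    for T, theta in layers:
        prev = acts[-1]
        acts.append([sigmoid(sum(w * v for w, v in zip(row, prev)) + th)
                     for row, th in zip(T, theta)])
    return acts


def total_error(patterns, layers):
    """E of equation (20): squared error summed over patterns and output neurons."""
    return sum((t - o) ** 2
               for x, target in patterns
               for t, o in zip(target, forward(x, layers)[-1]))


def train_epoch(patterns, layers, eps):
    """Apply one batch update using equations (21) through (24)."""
    d_T = [[[0.0] * len(row) for row in T] for T, _ in layers]
    d_theta = [[0.0] * len(theta) for _, theta in layers]
    for x, target in patterns:
        acts = forward(x, layers)
        # Equation (23): deltas for the output layer.
        delta = [(t - o) * o * (1.0 - o) for t, o in zip(target, acts[-1])]
        for l in range(len(layers) - 1, -1, -1):
            inputs = acts[l]                      # outputs of the layer below
            for i, d in enumerate(delta):
                d_theta[l][i] += eps * d          # equation (22)
                for j, v in enumerate(inputs):
                    d_T[l][i][j] += eps * d * v   # equation (21)
            if l > 0:
                # Equation (24): propagate the deltas to the hidden layer below.
                T = layers[l][0]
                delta = [o * (1.0 - o) * sum(delta[k] * T[k][j] for k in range(len(T)))
                         for j, o in enumerate(inputs)]
    for l, (T, theta) in enumerate(layers):
        for i, row in enumerate(T):
            theta[i] += d_theta[l][i]
            for j in range(len(row)):
                row[j] += d_T[l][i][j]
```

A typical use is to build a small network with `make_network([2, 3, 1])`, call `train_epoch` repeatedly on a list of (input, target) pairs, and watch `total_error` fall. The step size `eps` is a free parameter that has to be tuned for each problem.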

Lapedes and Farber25 proposed that a large class of functions of the
form $R^n$ mapping to $R^m$ can be accurately approximated using only two
hidden layers of neurons. They indicated that most signal processing
tests cannot distinguish between chaotic behavior (nonlinear systems)
and stochastic noise, and showed the ability of a neural network to
model the Mackey-Glass equation, which exhibits chaotic behavior. For
this test they used $g(x) = \frac{1}{2}(1 + \tanh(x))$ as the sigmoidal function.
Using the results of Takens,26 their system used four input nodes, since
the Mackey-Glass equation generates a strange attractor with dimension
3.5.

In an earlier paper, Lapedes and Farber27 showed through testing that
neural networks are able to predict points in a chaotic time series with
orders of magnitude greater accuracy than conventional methods, such
as the Linear Predictive method and the Gabor-Volterra-Wiener
Polynomial method. They state that neural networks perform well, since
they globally approximate a system’s mapping by performing a generalized
mode decomposition. They mention that the accuracy of the network
can be improved by increasing the number of neurons in the hidden
layers (the layers between the input and output nodes). For the
Mackey-Glass equation modeling, they used a two-layer neural network with
10 neurons in each hidden layer. This system required about 30 to
60 minutes training time on a Cray X-MP computer. They also presented
an interesting analogy between neural networks and Fourier analysis,27
which is summarized in the following paragraph.

Consider the sum of two sigmoids: $a_1 g(b_1 x + c_1) + a_2 g(b_2 x + c_2)$; it
is seen that b adjusts the slope of the sigmoids, c adjusts the shifts, and
a adjusts the gain. If we let g be a sinusoid, then a acts like a Fourier
amplitude, b like frequency, and c like phase shift. The a’s are the
synaptic weights of the hidden to output layer, the b’s are synaptic
weights of the input to the hidden layer, and the number of g functions
is the number of hidden units in the hidden layer. The number
of adjustable frequencies (using sine in place of sigmoid) is thus determined
by the number of neurons in the hidden layer.
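
The analogy can be made concrete by writing the hidden layer as a sum of terms $a_k g(b_k x + c_k)$ and swapping the function $g$. The sketch below is only an illustration of that reading; the names `hidden_layer_sum` and `sigmoid` are hypothetical.

```python
import math


def sigmoid(u):
    """Logistic sigmoid used as one possible choice of g."""
    return 1.0 / (1.0 + math.exp(-u))


def hidden_layer_sum(x, a, b, c, g):
    """Evaluate sum_k a[k] * g(b[k] * x + c[k]) for a scalar input x.

    a: hidden-to-output weights (gain, or Fourier amplitude when g is a sine)
    b: input-to-hidden weights (slope, or frequency when g is a sine)
    c: hidden offsets (shift, or phase when g is a sine)
    """
    return sum(ak * g(bk * x + ck) for ak, bk, ck in zip(a, b, c))
```

Calling `hidden_layer_sum(x, a, b, c, math.sin)` gives a truncated Fourier-like series with one adjustable frequency per hidden unit, while passing `sigmoid` gives the usual network output before any output nonlinearity.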

Lapedes  and  Farber27  also  presented  a  convenient  scaling  methodology, 
where  the  network  is  “trained”  using  an  input/output  sequence  in  the 
range  of  0  to  1.  Once  the  network  has  completed  training,  the  network’s 
weights  can  then  be  scaled  to  handle  inputs  of  an  arbitrary  range. 
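
The report does not give the scaling formula, so the following is only one plausible reading: if the network was trained on inputs normalized as $(x - lo)/(hi - lo)$, the first-layer weights and offset can absorb that normalization so that raw inputs can be fed directly. The function names and the variables `lo` and `hi` are assumptions made for the example.

```python
import math


def neuron(x, weights, theta):
    """Single sigmoidal neuron: g(sum_j T_j x_j + theta)."""
    u = sum(w * v for w, v in zip(weights, x)) + theta
    return 1.0 / (1.0 + math.exp(-u))


def rescale_first_layer(weights, theta, lo, hi):
    """Fold the input normalisation (x - lo) / (hi - lo) into the weights.

    Returns (new_weights, new_theta) such that the neuron applied to raw inputs
    gives the same output as the original neuron applied to normalised inputs.
    """
    span = hi - lo
    new_weights = [w / span for w in weights]
    new_theta = theta - sum(w * lo / span for w in weights)
    return new_weights, new_theta
```

Only the input side is handled here; mapping a 0 to 1 output back to physical units would be a separate affine step applied after the output layer.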

Many  successful  neural  network  applications  to  image  processing 
have  been  reported.  Widrow  and  Winter5  used  a  neural  network  to 
produce  a  pattern  recognition  classifier  that  is  insensitive  to  translation, 
rotation,  and  scale  changes.  Wilson28  used  neural  networks  as  a  voter 
for  pattern  recognition.  Wilson  applied  vector  morphology  to  the  texture 
analysis  problem  and  used  neural  networks  for  the  voting  logic  involved 
in  determining  the  “fit”  of  image  areas  to  the  set  of  possible  structuring 
elements.  Haykin  and  Leung29  successfully  modeled  radar  sea  clutter 
using  a  two-layer  neural  network.  Glover30  described  a  system  built  for 
assembly line automatic inspection. The system consists of a video-input
optical/electronic Fourier feature extraction module and a PC/AT
with plug-in neural net board (Hecht-Nielsen AZ1000 ANZA
neurocomputer)  for  feature  signature  classification.  This  system  performs 
global  shape  and  texture  analysis  at  speeds  up  to  15  images  per  second. 

Tenorio  and  Hughes31  discussed  a  system  that  uses  a  Markov  image 
model,  where  Markov  fields  and  approximate  maximum  a  priori 
probabilities  are  input  to  a  neural  network  that  is  used  to  segment  the 
image.  The  system  is  invariant  to  rotation,  scaling,  position,  translation, 
and multiplicity of objects. They indicated that the system correctly
segments  images  regardless  of  the  number  of  objects  or  their  size, 
providing  the  objects  are  within  the  knowledge  base  of  the  network. 
This  system  will  thus  misinterpret  unknown  objects. 

Mesrobian  and  Skrzypek32  discussed  a  multilevel  neural  network 
approach  to  the  discrimination  of  natural  textures.  Their  proposed  system 
will  consist  of  three  functional  layers.  The  first  layer  is  the  feature 
extraction  network,  consisting  of  parallel  elements  to  extract  edges, 
line segments, line terminators, and corners from the image. The second
layer  is  the  local  boundary  detection  layer  that  locates  the  perimeter  of 
regions  with  uniform  texture  properties.  The  third  and  highest  level 
layer  is  the  higher  order  discrimination  network  that  attempts  to  segment 
the  textured  images  with  a  higher  level  of  complexity.  This  level  is 
based  on  the  premise  that  grouping  mechanisms  employed  in  the 
discrimination  of  simple  textures  can  also  be  used  to  discriminate  textures 
of  greater  structural  complexity. 

Manjunath  et  al.33  performed  a  comparison  of  texture  segmentation 
algorithms using a Markov random field model, implemented on Hopfield
neural  networks.  Segmentation  tests  were  performed  using  an  image 
consisting  of  six  Brodatz34  textures.  The  following  misclassification 
error  results  were  obtained: 

•  Maximum  likelihood  estimate  —  22.2% 

•  Neural  network  with  maximum  likelihood  estimate  as  initial  state  — 
16.3% 

• Neural network with random initial state — 14.7%

• Neural network with simulated annealing — 6.7%

• Maximizing the posterior marginal distribution — 7.1%

• Neural network with stochastic learning — 8.7%

• Hierarchical network (coarse to fine) — 8.2%.

They  indicated  that  although  simulated  annealing  worked  the  best, 
hundreds  of  iterations  were  required.  Simulated  annealing  is  a  method 
that  allows  the  system  to  diverge  in  a  controlled  fashion  to  prevent  the 
solution  from  becoming  trapped  in  a  local  minimum.  The  maximizing 
posterior  marginal  distribution  rule  also  performed  well  but  required 
hundreds  of  iterations.  Reference  33  contains  more  details  on  the  individual 
approaches  used. 
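
The idea of letting the solution get temporarily worse in a controlled way can be shown with a generic annealing loop. This is a minimal sketch of the general technique with a geometric cooling schedule, not the network formulation used by Manjunath et al.; the function name, parameters, and schedule are assumptions.

```python
import math
import random


def simulated_annealing(energy, neighbour, state,
                        t_start=1.0, t_end=1e-3, steps=2000, seed=0):
    """Minimise energy(state) by accepting uphill moves with probability exp(-dE / T).

    energy: function mapping a state to a number
    neighbour: function (state, rng) -> candidate state
    The temperature falls geometrically from t_start to t_end (steps >= 2).
    """
    rng = random.Random(seed)
    current, e_cur = state, energy(state)
    best, e_best = current, e_cur
    for k in range(steps):
        temp = t_start * (t_end / t_start) ** (k / (steps - 1))
        cand = neighbour(current, rng)
        e_cand = energy(cand)
        # Downhill moves are always taken; uphill moves become rarer as temp falls.
        if e_cand <= e_cur or rng.random() < math.exp((e_cur - e_cand) / temp):
            current, e_cur = cand, e_cand
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best
```

For segmentation the state would be a label image and the energy a model-based cost, which is why each sweep is expensive and why hundreds of iterations add up.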

Neural  networks  have  two  extremely  attractive  features:  they  are 
conceptually  simple,  and  they  utilize  a  highly  parallel  approach  that  is 
well suited for parallel processing. Many elaborate neural network software
packages are already commercially available, and neural network
hardware  will  soon  be  available  that  will  allow  the  implementation  of 
high-speed  networks.  Neural  networks  may  provide  two  promising  avenues 
to  the  texture  analysis  problem.  One  method  will  use  traditional  texture 
analysis  methods  to  extract  feature  vectors  from  an  image  and  then  use 
a  neural  network  to  make  inferences  based  on  these  vectors.  Another 
method  will  use  a  neural  network  to  directly  model  the  image  by  using 
the  weights  of  the  network  as  the  feature  vector  of  the  image. 


6.0 Modeling and Stochastic Techniques

Many other techniques have been proposed and attempted with various
degrees of complexity and fairly uniform degree of success. These
techniques can be roughly divided into modeling techniques, where a
model is specified for the generation or analysis of texture, and stochastic
techniques, where image statistics are used for texture identification.
This research focused on the newer techniques of fractals and neural
networks, although a few modeling and stochastic techniques were covered
in the process. The following two sections describe the papers that
present these types of techniques.

6.1 Modeling

Bovik et al.35 discussed texture analysis using local spatial filters,
i.e., the Gabor function given by:

\[ h(x, y) = g(x', y') \exp[2\pi j(Ux + Vy)] , \qquad (25) \]

where $(x', y') = (x \cos\phi + y \sin\phi,\ -x \sin\phi + y \cos\phi)$ and

\[ g(x, y) = \frac{1}{2\pi\lambda\sigma^2} \exp\left[ -\frac{(x/\lambda)^2 + y^2}{2\sigma^2} \right] . \]

With their method, tunable Gabor filters are used to model an image,
and the segmentation is performed using channel amplitude and phase
comparisons. The Gabor filters have tunable orientation and radial
frequency bandwidths, making them well suited for this purpose. They
indicated that the channel amplitude response can be used to detect
boundaries between textures and that large variations in the channel
phase response provide a way to detect discontinuities in texture phase.
The resulting segmentation achieved a good resemblance to visual
perception. The research performed in this paper was directed by physiological
evidence that Gabor-shaped receptive fields are fundamental to the
biological processing of texture.
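
To make the channel amplitude and phase concrete, the sketch below evaluates the Gabor function of equation 25 and convolves it with an image at a single pixel. The window half-width, the direct summation, and the function names are assumptions made for the example; Bovik et al. do not prescribe this implementation.

```python
import cmath
import math


def gabor(x, y, U, V, sigma, lam=1.0, phi=0.0):
    """Complex Gabor function h(x, y) of equation 25 with aspect ratio lam."""
    xr = x * math.cos(phi) + y * math.sin(phi)
    yr = -x * math.sin(phi) + y * math.cos(phi)
    envelope = (math.exp(-((xr / lam) ** 2 + yr ** 2) / (2.0 * sigma ** 2))
                / (2.0 * math.pi * lam * sigma ** 2))
    return envelope * cmath.exp(2j * math.pi * (U * x + V * y))


def channel_response(image, row, col, U, V, sigma, half=6):
    """Amplitude and phase of the Gabor channel at one pixel.

    Computes the convolution sum over a (2*half + 1)^2 window; x runs along
    columns and y along rows. Pixels outside the image are skipped.
    """
    acc = 0j
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            r, c = row - dy, col - dx
            if 0 <= r < len(image) and 0 <= c < len(image[0]):
                acc += image[r][c] * gabor(dx, dy, U, V, sigma)
    return abs(acc), cmath.phase(acc)
```

A channel tuned to the dominant frequency of a texture returns a large amplitude inside that texture and a small one elsewhere, which is the property the amplitude comparison relies on for locating boundaries.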

Chellappa  and  Kashyap,36  Chellappa  and  Shankar,37, 38  Chellappa 
et  al.,39  and  Chellappa40,41  discuss  the  use  of  a  two-dimensional  noncausal 
autoregressive  model  for  the  synthesis  and  classification  of  texture. 
Chellappa  and  Kashyap36  began  with  a  Gaussian  Markov  random  field 
model  given  by 


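\[ y(s) = \sum_{r \in N} \theta_r\, y(s + r) + \sqrt{\rho}\, \omega(s) , \qquad (26) \]

and to simplify image synthesis they modified the model to be

\[ y(s) = \sum_{r \in N} \theta_r\, y(s \oplus r) + \sqrt{\rho}\, \omega(s) , \quad s \in \Omega , \qquad (27) \]

where $\Omega = \{(i, j),\ 0 \le i, j \le M - 1\}$, $\theta_r$ and $\rho$ are the model parameters,
$\omega(s)$ is a Gaussian noise signal, and $N$ is the neighbor set of pixels for
the pixel $y(s)$. $\oplus$ is the sum modulo $M$ operator. The modified model
is said to have nearly the same second-order properties as the ideal
model for a large image. The set of $M^2$ equations can be represented in
a matrix vector form as

\[ B(\theta)\, y = \sqrt{\rho}\, \omega , \qquad (28) \]

where $B(\theta)$ is a block circulant matrix, and $y$ and $\omega$ are $M^2$ vectors
derived from the arrays $y(s)$ and $\omega(s)$. Let the eigenvalues of $B(\theta)$ be
given by $\mu_s$. If

\[ \mu_s = (1 - \theta^T \phi_s) , \qquad (29) \]

where $\phi_s = \mathrm{col.}\left[ \exp\left( \sqrt{-1}\, \frac{2\pi}{M}\, s^T r \right),\ r \in N \right]$, then an image vector $y$ can
be synthesized by the following equations:

\[ y = \sum_{s \in \Omega} f_s \frac{x_s}{\mu_s} + \alpha \mathbf{1} \qquad (30) \]

\[ \alpha = E(y(s)), \quad x_s = \frac{\sqrt{\rho}\, f_s^{*T} \omega}{M^2}, \quad \mathbf{1} = \mathrm{col.}(1, 1, \ldots, 1) \qquad (31) \]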


Survey  of  Texture  Segmentation,  Classification,  and  Synthesis  Methods 


15 


\[ f_s = \mathrm{col.}[t_j,\ \lambda_i t_j,\ \ldots,\ \lambda_i^{M-1} t_j], \quad M^2\text{-vector} \qquad (32) \]

\[ t_j = \mathrm{col.}[1,\ \lambda_j,\ \ldots,\ \lambda_j^{M-1}], \quad M\text{-vector} \qquad (33) \]

\[ \lambda_i = \exp\left( \sqrt{-1}\, \frac{2\pi i}{M} \right), \quad s = (i, j) . \qquad (34) \]

The algorithm requires $O(M^2 \log M)$ operations. An extremely attractive
feature of this algorithm is the ability to control a specific set of parameters
$\theta_r$ and $\rho$, which determine the appearance of the texture. In earlier work
Chellappa and Kashyap tabulated the parameter values required to
duplicate several Brodatz34 textures, using as few as 16 parameters for
a good representation. They also presented maximum likelihood methods
for determining the parameters required to fit their model to a given
texture.

Khotanzad42 used methods similar to those of Chellappa. He used a
simultaneous autoregressive model given by:

\[ g(x, y) = \sum_{(i,j) \in N} \theta_{i,j}\, g(x \oplus i,\ y \oplus j) + \omega(x, y) , \qquad (35) \]

for $\{ g(x, y);\ x, y = 0, \ldots, M - 1 \}$, where $N$ is a neighbor set defined
in the spatial domain. He stated that
he used Chellappa’s method of maximum likelihood for the model’s
parameter estimation. Khotanzad specifically addressed texture classification
in his paper, using different N models. Using one N model,
consisting of the four horizontal and vertical neighbors, and a second
that uses the four diagonal neighbors, he obtained an average correct
classification rate of 98%.
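
Equation 35 can be read in reverse: given an image and a set of neighbor weights, the noise term is whatever the neighbors fail to predict. The sketch below computes that residual on a torus (indices wrap modulo M). It illustrates the model only; it is not Khotanzad's estimation or classification procedure, and the function name and dictionary layout for the weights are assumptions.

```python
def sar_residual(image, theta):
    """Return omega(x, y) = g(x, y) - sum theta[(i, j)] * g(x (+) i, y (+) j).

    image: square M x M list of lists
    theta: dict mapping neighbour offsets (i, j) to weights
    Offsets wrap around the image edges, matching the modulo-M sum.
    """
    m = len(image)
    return [[image[x][y] - sum(t * image[(x + i) % m][(y + j) % m]
                               for (i, j), t in theta.items())
             for y in range(m)]
            for x in range(m)]
```

With the four horizontal and vertical neighbors each weighted 0.25, a constant image is predicted exactly and the residual is zero everywhere; the spread of the residual on real data indicates how well a given neighbor set describes the texture.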


6.2 Stochastic

Cano et al.43 proposed a method to find a set of texture parameters
that is visually complete and compact, using a hierarchical filter bank
approach (multiresolution). The process involves applying a local mask H:

\[ G_l(x, y) = H_l(x, y)\, G_0(x, y); \quad l = 1, \ldots, N , \qquad (36) \]

where the first H is a low pass filter and successive H’s are generated
by

\[ \ldots \otimes I . \qquad (37) \]

In equation 37, $\otimes$ denotes the Kronecker product. The size of the
mask increases geometrically with the level of resolution. The mean,
variance, and third-degree moments are then computed for each of the
images $G_l$. The number of features that must be computed is given by

\[ N_f = L n^2 (D - 1) + D , \qquad (38) \]

where L is the number of hierarchical levels, D is the maximum degree
of moment, and n is the size of the mask. This method worked well for
stochastic textures but poorly for highly structured textures, since phase
information is excluded in the filter bank technique. For highly structured
textures Cano et al. developed a translation invariant operator based on
the Fourier transform. The resulting operator worked well on periodic
textures.

Fan44 used an edge-based hierarchical algorithm for image segmentation.
Generalized likelihood ratio like functions were used as discriminant
functions, and boundaries were located with a maximum likelihood
estimator. This algorithm required no prior knowledge of the texture
model parameters or the number of texture regions. The method performed
as well as 90% for some textures but as poorly as 60% for others.

Modestino et al.1 discussed a texture discrimination approach based
on spatial gray-level co-occurrences and maximum likelihood classification
using a log-likelihood discriminator. This approach proved to be effective
on random fields with identical second moments, where autocorrelation-power
spectral density and edge density-correlation techniques were
ineffective. Because autoregressive models cannot account for edges
and because stochastic models do not provide for repetition of a local
pattern, Modestino et al. used a discriminator that employs both correlation
and edge density information. However, the discrimination process
required a knowledge of the model parameters for each texture. Another
disadvantage in this method is that an “interference” variable must be
judiciously set by the user. A tradeoff exists for the strength of the
interference between proper classification and ill-defined points of
the intersection of region boundaries.

Vickers and Modestino45 extended Modestino’s1 work. Using this
technique, a training set of images is used for each texture class before
the unknown set is applied to the classifier. Required preprocessing
of the image included normalization, enhancement, and noise cleaning.
Testing on 8-bit Brodatz34 images yielded classification rates as high as
98% over a small data set. Vickers and Modestino46 previously outlined
a method to estimate the model parameters, whereas Modestino’s1 paper
required a priori knowledge of these parameters.


7.0 Conclusions

It appears from this review that no single approach provides a robust
texture analysis methodology without an overwhelming amount of complexity.
The best approach to the problem seems to be to use a variety
of these methods in their simplest and most computationally economic
form. The idea of a multifaceted approach to texture analysis is also
supported by the design of the primate visual system. It has been found47
that the visual center of the primate brain contains multiple maps of the
visual field, each sensitive to particular aspects, such as motion, color,
and shape.

Consequently, a texture analysis toolkit needs to be formed, preferably
containing the better understood approaches. Edge detection must
be  included  in  the  toolbox  for  texture  analysis,  since  most  of  the  texture 
approaches  have  difficulty  with  edges  in  an  image.  The  edges  typically 
need  to  be  isolated  so  that  texture  analysis  techniques  can  be  applied  to 
the  regions  between  edges.  The  basic  toolbox  should  probably  include 
edge  detection,  the  gray-level  co-occurrence  matrix  and  its  derivative 
properties,  fractal  dimension,  and  a  Markov  random  field  model,  since 
results  obtained  by  using  these  approaches  are  well  documented.  With 
these  analysis  tools  several  image  features  can  be  extracted.  These 
extracted  features  can  then  be  used  as  input  to  a  human  operator,  to  an 
expert  system,  or  to  a  neural  network  to  perform  the  task  of  image 
interpretation. 
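
As an example of one toolbox entry, the sketch below builds a normalized gray-level co-occurrence matrix for a single displacement and derives two common properties from it. The function names, the non-symmetric counting convention, and the choice of contrast and energy as the derived properties are assumptions made for illustration.

```python
def cooccurrence_matrix(image, dx, dy, levels):
    """Normalised co-occurrence matrix for pixel pairs separated by (dx, dy).

    image: list of rows of integer gray levels in range(levels)
    Entry [a][b] is the fraction of pairs whose first pixel has level a and
    whose displaced pixel has level b. At least one pair must fit in the image.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of (a - b)^2 * p[a][b]; large when neighbouring levels differ strongly."""
    return sum((a - b) ** 2 * v for a, row in enumerate(p) for b, v in enumerate(row))


def energy(p):
    """Sum of squared entries; large when few level pairs dominate."""
    return sum(v * v for row in p for v in row)
```

Feature vectors formed from such properties at several displacements are the kind of input that could then be passed to a human operator, an expert system, or a neural network.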

Neural  networks  may  ultimately  provide  the  infrastructure  for  a 
“complete”  visual  system,  incorporating  both  feature  extraction  and  image 
interpretation.  Progress  continues  to  be  made  both  in  the  understanding 
of  the  human  visual  system  and  in  the  development  of  parallel  computing 
machines.  With  the  successful  alliance  of  these  two  fields  of  research, 
engineers  and  scientists  may  eventually  be  able  to  mimic  the  functionality 
of the human visual system.